ABSTRACT

The frustratometer is an energy landscape theory- 
inspired algorithm that aims at quantifying the 
location of frustration manifested in protein molecules.
Frustration is a useful concept for gaining
insight into a protein's biological behavior by
analyzing how the energy is distributed in protein 
structures and how mutations or conformational 
changes shift the energetics. Sites of high local frus- 
tration often indicate biologically important regions 
involved in binding or allostery. In contrast, minim- 
ally frustrated linkages comprise a stable folding 
core of the molecule that is conserved in conform- 
ational changes. Here, we describe the implementa- 
tion of these ideas in a webserver freely available at 
the National EMBNet node (Argentina), at URL:
http://lfp.qb.fcen.uba.ar/embnet/. 

INTRODUCTION 

The energy landscape theory of protein folding is based on 
a statistical description of a protein's potential energy 
surface. Globally, the most realistic model of a protein is 
a minimally frustrated heteropolymer with a rugged 
funnel-like landscape biased toward the native state (1). 
This statistical description has been developed using tools 
from the statistical mechanics of disordered systems, poly- 
mers and phase transitions of finite systems. Natural 
proteins, as we observe them today, are highly evolved 
complex systems. Self-assembly of and mutual recognition 
between these polypeptides leading to reasonably well-defined
structural ensembles is a fundamental concept in
the biology of macromolecules. The specificity of folding
and binding is captured by the 'Principle of minimal frus- 
tration' (2). This principle states that the general energy of 
the protein decreases more than what may be expected by 
chance as the protein assumes conformations progres- 
sively more like the ground (native) state. In other 
words, there is a strong energetic bias toward the native 
basin that overcomes both the asperities of the landscape 
which stabilize kinetic traps and also ultimately the 
entropy of the chain. It has been shown that the structures
of transition state ensembles (3,4), the folding rate
variations (5), the existence of folding intermediates (6), 
dimerization mechanisms (7) and domain swapping events 
(8) are often well predicted in models where energetic frus- 
tration has been removed from the model landscape and 
topological information of the native state is the sole 
input. Still, inhomogeneity in the native contacts ener- 
getics, non-native interactions and the residual local frus- 
tration present in the native ensemble do contribute to the 
functional characteristics of proteins, 'molding' the rough- 
ness that underlies the detailed protein dynamics (9,10). 

Local frustration 

The principle of minimal frustration does not rule out that 
some energetic frustration may be present in a folded 
protein. Moreover, the remaining frustration may not be 
random but evolved, facilitating motion of the protein 
around its native basin, and as such the residual frustra- 
tion may be fundamental to protein function (9,10). 
Theoretical methods allow for spatially localizing and 
quantifying the energetic frustration present in native 
protein structures by developing a spatially local version 
of the global gap criterion formulation of the minimal 
frustration principle (11). This algorithm compares the 
contribution to the extra stabilization energy ascribed to a
given pair of amino acids in the native protein to the stat- 
istics of the energies that would be found by placing dif- 
ferent residues in the same native location or by creating a 
different environment for the interacting pair. If there is a 
sufficient additional stabilization for an individual native 
pair as normalized by the typical energy fluctuation 
(in accord with the global Z-score criterion for minimal 
frustration) the local interaction can be called 'minimally 
frustrated'. If the stabilization of the native pair lies in the 
middle of the distribution of alternatives, the interaction 
can be considered 'neutral'. On the other hand, if the 
native pair is sufficiently destabilizing compared with the 
other possibilities, the interaction is 'highly frustrated'. 
Details of the method and the energy functions can be 
found in references (11,12,13) and at the server documen- 
tation pages. 

MATERIALS AND METHODS 

Input files 

The main input file contains a set of protein structure co- 
ordinates in the standard format of the Protein Data Bank 
(http://www.rcsb.org). Users can upload their own struc- 
ture file or provide the four-letter code for existing PDB 
entries, in which case the server will automatically 
download the data. The coordinates are checked for 
overall formatting and if there is more than one amino 
acid chain in the model, the user is asked to specify the 
chains to process. A dialogue box and an interactive Jmol
interface are provided to assist in this process. The user
can specify as many chains as wanted. Jobs are automatically
accepted for complexes of up to 1000 amino acid residues.
Users can optionally provide an e-mail
address to receive a notification of job completion. 

Server calculations 

The server automatically applies filters that remove hydro- 
gens, heteroatoms and alternative conformations for 
residues. If the input file contains multiple models, only 
the first one is analyzed by default. Only the 20 standard
amino acids are taken into account, and if backbone or
C-beta atoms are missing, they are automatically built
into the file using the automodel option of the Modeller suite
(http://salilab.org/modeller/). The jobs are assigned a
JobID, organized in a run queue and processed as computational
resources become available. A typical run for a
200-residue protein takes about 5 minutes of CPU time,
and about 60 minutes for a 500-residue complex.
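
The filtering step described above can be illustrated with a minimal sketch that works directly on the fixed-column ATOM records of a PDB file. This is not the server's code: the function name filter_pdb_lines, the choice to keep only blank or 'A' alternate locations, and the fallback used when the element column is empty are all assumptions made for illustration. Rebuilding missing atoms with Modeller is not attempted here.

```python
# Illustrative sketch only; names and policies below are assumptions.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def filter_pdb_lines(lines):
    """Return the ATOM records of the first model, without hydrogens,
    heteroatoms, non-standard residues or extra alternate locations."""
    kept = []
    for line in lines:
        record = line[:6].strip()
        if record == "ENDMDL":
            break  # only the first model is analyzed
        if record != "ATOM":
            continue  # drops HETATM and every non-coordinate record
        if line[17:20] not in STANDARD_RESIDUES:
            continue
        # Column 17 holds the alternate location indicator.
        # Keeping only ' ' or 'A' is an assumption of this sketch.
        if line[16:17] not in (" ", "A"):
            continue
        element = line[76:78].strip().upper()
        if not element:
            # Fall back to the first letter of the atom name.
            element = line[12:16].strip().lstrip("0123456789")[:1]
        if element in ("H", "D"):
            continue  # hydrogens (and deuterium) are removed
        kept.append(line)
    return kept
```

A file read with open(path).read().splitlines() can be passed in directly; the returned list preserves the original order of the retained records.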

Outputs 

A results page is generated for each job. These pages can 
be accessed by following the link sent by e-mail 
(if provided) or by specifying the JobID. The server generates
several projections of the local frustration calculations
(Figure 1). An interactive Jmol applet facilitates
inspection of the structures for which the minimally 
frustrated and highly frustrated contacts were identified. 
This information is also plotted as a contact map. Linear 
projections of local frustration distributions are also 
provided and can be enlarged on the same page. Results
can be downloaded by following the link at the bottom of
the same page. The download pack includes the input file
as processed by the core algorithm, scripts to interactively
visualize frustratographs in the Jmol, PyMOL or VMD
programs, together with the raw tables of the frustration
index calculations and an explanatory README file. The
results are accessible for 30 days and permanently deleted
afterward.

Figure 1. Frustratometer server output. An example of the localized frustration and minimally frustrated networks in protein structures (pdb:
2FCK). Left: the protein backbone is displayed as blue ribbons, the direct inter-residue interactions with solid lines and the water-mediated
interactions with dashed lines. Minimally frustrated interactions are shown in green, highly frustrated contacts in red; neutral contacts are not drawn.
Right: Projection of the local frustration distribution onto the amino acid sequence. The number of contacts within 5 Å of the C-alpha of each residue is
plotted, classified according to their frustration index.



RESULTS 

Interpreting results 

The frustration index measures how favorable a particular
contact is relative to the set of all possible contacts in that
location, normalized using the variance of that distribution.
For initial inspection, the server classifies the individual
contacts according to their frustration index value. A contact
is defined as 'minimally frustrated' (drawn green) if its
native energy is at the lower end of the distribution of
decoy energies, with a frustration index of 0.78 or
higher (11); that is, the majority of (but by
no means all!) other amino acid pairs in that position
would be less favorable. Conversely, a contact is defined
as 'highly frustrated' (drawn red) if the native energy is
at the other end of the distribution, with a local frustration
index lower than -1; that is, most other amino acid pairs
at that location would be more favorable for folding than
the native ones by more than one standard deviation of
that distribution. If the native energy is in between these
limits, the contact is defined as 'neutral' (drawn grey or
not shown).
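
The classification above can be sketched in a few lines, assuming the index is computed as a Z-score in which a native energy below the decoy mean gives a positive value. The sign convention, the use of the population standard deviation and the function names are assumptions of this sketch; only the cutoffs 0.78 and -1 are taken from the text.

```python
import statistics

MINIMALLY_FRUSTRATED_CUTOFF = 0.78  # cutoff quoted in the text (11)
HIGHLY_FRUSTRATED_CUTOFF = -1.0     # cutoff quoted in the text


def frustration_index(native_energy, decoy_energies):
    """Z-score of the native energy against the decoy distribution.

    Assumed convention: lower (more stabilizing) native energy than the
    decoy mean gives a positive index.
    """
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (mean - native_energy) / spread


def classify_contact(index):
    """Map a frustration index onto the three classes used by the server."""
    if index >= MINIMALLY_FRUSTRATED_CUTOFF:
        return "minimally frustrated"
    if index < HIGHLY_FRUSTRATED_CUTOFF:
        return "highly frustrated"
    return "neutral"
```

For example, classify_contact(frustration_index(e_native, decoys)) labels one contact once its native energy and a list of decoy energies are available.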

A frustration index may depend on the choice of parts
into which the protein's total energy is divided. It therefore
becomes natural to divide the energy up in a way that
is at least roughly comparable to what natural selection 
can do: examine the changes in energy on making muta- 
tions. The webserver provides two complementary ways 
for localizing frustration that differ in how the set of 
decoys is constructed: 

Mutational frustration (How favorable are the native
residues relative to other residues in that location?) 

The decoy set is made by randomizing the identities of the
interacting amino acids, keeping all other interaction parameters
at their native value. This scheme effectively
evaluates every possible mutation of the amino acid pair
that forms a particular contact in a fixed structure.
It is worth noting that the energy change on mutating a residue
pair comes not only directly from the particular
contact probed but also from the interactions of
each residue with other residues not in the mutated pair,
since those contributions also vary on mutation.
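
A toy version of this decoy construction is sketched below. It assumes a hypothetical pairwise contact-energy table gamma (a dict keyed by residue-type pairs) in place of the server's actual energy function, and it draws decoy identities from the protein's own sequence, which is also an assumption. The point it illustrates is that each decoy energy includes both the direct contact and the contacts of the two mutated residues with their unchanged neighbors.

```python
import random


def pair_energy(aa_i, aa_j, neighbors_i, neighbors_j, gamma):
    """Energy attributed to a pair in a fixed structure (toy model).

    Sums the direct i-j contact and the contacts of i and j with their
    other neighbors, whose identities stay at their native values.
    """
    energy = gamma[(aa_i, aa_j)]
    energy += sum(gamma[(aa_i, n)] for n in neighbors_i)
    energy += sum(gamma[(aa_j, n)] for n in neighbors_j)
    return energy


def mutational_decoy_energies(sequence, neighbors_i, neighbors_j, gamma,
                              n_decoys=1000, seed=None):
    """Randomize only the identities of the interacting pair.

    sequence    : residue types to sample from (assumed: the protein itself)
    neighbors_i : residue types in contact with i, excluding j
    neighbors_j : residue types in contact with j, excluding i
    """
    rng = random.Random(seed)
    return [
        pair_energy(rng.choice(sequence), rng.choice(sequence),
                    neighbors_i, neighbors_j, gamma)
        for _ in range(n_decoys)
    ]
```

The native value for the same contact is pair_energy evaluated with the native residue types, so native and decoy energies are computed on the same footing.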

Configurational frustration (How favorable are the native 
interactions between two residues relative to other 
interactions these residues can form in other 
compact structures?) 

This way of measuring frustration imagines that the
residues are not only changed in identity but may also
be displaced in location. The energy variance thus
reflects contributions from the different energies of other
compact conformations. In this calculation, the decoy
set involves randomizing not just the identities but also
the distances and densities of the interacting amino acids.
This scheme effectively evaluates the native pair with
respect to a set of structural decoys that might be encountered
in the folding process.
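
The configurational scheme can be sketched in the same toy setting. Here the energy function is passed in as a hypothetical callable contact_energy(aa_i, aa_j, distance, density_i, density_j), and decoy distances and densities are drawn from lists of values observed in the native structure. Both the callable's signature and the sampling choices are assumptions of this sketch, not a description of the server's implementation.

```python
import random


def configurational_decoy_energies(sequence, native_distances,
                                   native_densities, contact_energy,
                                   n_decoys=1000, seed=None):
    """Randomize identities, distance and local densities of a pair.

    sequence         : residue types to sample identities from
    native_distances : inter-residue distances seen in native contacts
    native_densities : local densities seen for residues in the structure
    contact_energy   : hypothetical callable returning the pair energy
    """
    rng = random.Random(seed)
    energies = []
    for _ in range(n_decoys):
        aa_i = rng.choice(sequence)
        aa_j = rng.choice(sequence)
        distance = rng.choice(native_distances)
        density_i = rng.choice(native_densities)
        density_j = rng.choice(native_densities)
        energies.append(
            contact_energy(aa_i, aa_j, distance, density_i, density_j)
        )
    return energies
```

Compared with the mutational scheme, the only change is that the geometric environment of the pair is resampled along with the residue types, so the decoy distribution no longer refers to one fixed structure.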

Case studies 

A survey of nonredundant protein domains shows that 
natural protein domains are strongly crosslinked by min- 
imally frustrated contact networks comprising about 40% 
of the total contacts (11). Only a minority (~10%) of the 
native interactions are found to be 'highly frustrated', and 
these typically cluster at the protein surface (11). The re- 
maining 50% are 'neutral' and are randomly distributed in 
the structure. The highly frustrated interactions that, in 
principle, might conflict with the robust folding of the 
domain seem to reflect evolutionary constraints other 
than folding and often correspond to physiologically 
relevant sites. A statistical survey shows that these sites 
do co-localize with regions involved in the formation of 
heterodimeric protein assemblies (11). A survey on the 
local frustration patterns of allosteric domains shows 
that the regions that reconfigure are often enriched in 
patches of highly frustrated interactions (12), consistent 
with the idea that these locally frustrated regions may 
'crack' in these locations (14). On the other hand, the 
symmetry of multimeric protein assemblies allows near 
degeneracy by reconfiguring while maintaining minimally 
frustrated interactions (12). In addition, highly frustrated 
regions found in the native ensemble have been found to 
contribute to the stabilization of folding intermediates 
(15,16). In a similar spirit, the PROSA web service displays
Z-scores and energy plots that highlight potential conflicts 
spotted in protein structures (17). 

Concluding remarks 

Natural protein domains must be sufficiently stable to fold 
but often also need to be locally unstable to function. The 
possibility of localizing and quantifying the energetic frus- 
tration present in protein molecules allows one to probe 
lower hierarchies of the energy landscape manifested as 
the exploration of the configurational substates defined 
by the local roughness. Molding of this roughness can 
have profound effects on the structural transitions and is 
thus likely to have functional consequences. Particular 
examples of frustratographs can be very interesting, but
results should be interpreted with care, as in some cases the
highly frustrated regions may not correspond to the
known active sites. Performing statistical surveys of
homologs, mutants, etc. is encouraged. The server is supported
by a Documentation section, including a
quick-start guide and walkthrough with screenshots. A 
gallery of frustratographs with examples is also hosted 
at the site, together with fully interactive outputs that 
are linked from the help pages. We also host a FAQ
section, and personalized support is provided via e-mail
request.